Semantic Web Enabled Record Linkage Attacks on Anonymized Data
Abstract
Big Data analytics holds the promise of enabling new discoveries in medicine, more efficient business practices, and other important advances. However, much of the data involved in such analyses contains personally identifiable information (PII) that needs to be removed or obscured prior to release in order to protect individuals’ privacy. Anonymizing a dataset is not as easy as it seems, and many supposedly anonymous datasets are vulnerable to record linkage attacks. Most of these attacks are currently conducted manually and can be labor-intensive, but as semantic web technologies continue to gain popularity, the potential for automating various aspects of these attacks increases. This paper explores the components of a record linkage attack and how semantic web technologies could play a role in facilitating such attacks.
Similar resources
Evaluation of the disclosure risk of masking methods dealing with textual attributes
Record linkage methods evaluate the disclosure risk of revealing confidential information in anonymized datasets that are publicly distributed. Concretely, they measure the capacity of an intruder to link records in the original dataset with those in the masked one. In the past, masking and record linkage methods have been developed focused on numerical or ordinal data. Recently, motivated by t...
Exploiting Secondary Sources for Unsupervised Record Linkage
XML, Web services, and the Semantic Web have opened the door for new and exciting information integration applications. Information sources on the web are controlled by different organizations or people, utilize different text formats, and have varying inconsistencies. Therefore, any system that integrates information from different data sources must identify common entities from these sources....
Data conversion, extraction and record linkage using XML and RDF tools in Project SIMILE
SIMILE is a joint project between MIT Libraries, MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), HP Labs and the World Wide Web Consortium (W3C). It is investigating the application of Semantic Web tools, such as the Resource Description Framework (RDF), to the problem of dealing with heterogeneous metadata. This report describes how XML and RDF tools are used to perform da...
A fault-tolerant cryptographic protocol for patient record requests
The ARTEMIS research project aims at providing an interoperability framework for health care IT based on semantic Web services. One important issue is to identify patients across organisations’ boundaries to allow for an exchange of medical data in conformance with privacy regulations. The ‘Patient Identification Process Protocol’, which is based on a method used in the Cancer Registry of Lower...
Leveraging Social Media Signals for Record Linkage
Many data-intensive applications collect (structured) data from a variety of sources. A key task in this process is record linkage, which is the problem of determining the records from these sources that refer to the same real-world entities. Traditional approaches use the record representation of entities to accomplish this task. With the nascence of social media, entities on the Web are now a...
Publication date: 2016